The Blessing of Dimensionality: Separation Theorems in the Thermodynamic Limit
Authors
Abstract
We consider and analyze properties of large sets of randomly selected (i.i.d.) points in high-dimensional spaces. In particular, we consider the problem of whether a single data point that is randomly chosen from a finite set of points can be separated from the rest of the data set by a hyperplane. We formulate and prove stochastic separation theorems, including: 1) with probability close to one, a random point may be separated from a finite random set by a linear functional; 2) with probability close to one, for every point in a finite random set there is a linear functional separating this point from the rest of the data. The total number of points in the random sets is allowed to be exponentially large with respect to dimension. Various laws governing distributions of points are considered, and explicit formulae for the probability of separation are provided. These theorems reveal an interesting implication for machine learning and data mining applications that deal with large data sets (big data) and high-dimensional data (many attributes): simple linear decision rules and learning machines are surprisingly efficient tools for separating and filtering out arbitrarily assigned points in high dimensions.
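The separation claim in the abstract can be checked empirically with a small simulation. The sketch below is an illustration only, not the construction or the bounds used in the paper: it assumes points drawn uniformly from the unit ball and tests one particular linear functional, l(y) = <x, y>, counting a point x as separated when <x, y> < <x, x> for every other point y. The function names `sample_unit_ball` and `separable_fraction` are hypothetical helpers introduced here.

```python
import numpy as np


def sample_unit_ball(m, n, rng):
    """Draw m points uniformly from the unit ball in R^n (assumed distribution)."""
    g = rng.standard_normal((m, n))
    g /= np.linalg.norm(g, axis=1, keepdims=True)  # uniform directions on the sphere
    r = rng.random(m) ** (1.0 / n)                 # radii giving uniform volume density
    return g * r[:, None]


def separable_fraction(points):
    """Fraction of points x with <x, y> < <x, x> for every other point y.

    When the inequality holds, any threshold c strictly between
    max_y <x, y> and <x, x> gives a hyperplane <x, z> = c that
    separates x from the rest of the set.
    """
    gram = points @ points.T
    diag = np.diag(gram).copy()
    np.fill_diagonal(gram, -np.inf)  # exclude each point's product with itself
    return float(np.mean(gram.max(axis=1) < diag))


rng = np.random.default_rng(0)
for n in (2, 10, 50, 200):
    pts = sample_unit_ball(2000, n, rng)
    print(n, separable_fraction(pts))
```

Under these assumptions the printed fraction should rise toward one as the dimension n grows while the sample size stays fixed, which is the qualitative behavior the theorems describe.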
Similar resources
Blessing of dimensionality: mathematical foundations of the statistical physics of data
The concentrations of measure phenomena were discovered as the mathematical background to statistical mechanics at the end of the nineteenth/beginning of the twentieth century and have been explored in mathematics ever since. At the beginning of the twenty-first century, it became clear that the proper utilization of these phenomena in machine learning might transform the curse of dimensionalit...
Wiener Way to Dimensionality
This note introduces a new general conjecture correlating the dimensionality $d_T$ of an infinite lattice with $N$ nodes to the asymptotic value of its Wiener Index $W(N)$. In the limit of large $N$ the general asymptotic behavior $W(N) \approx N^{s}$ is proposed, where the exponent $s$ and $d_T$ are related by the conjectured formula $s = 2 + 1/d_T$, allowing a new definition of dimensionality $d_W = (s-2)^{-1}$. Being related to the t...
COUPLED FIXED POINT THEOREMS FOR RATIONAL TYPE CONTRACTIONS VIA C-CLASS FUNCTIONS
In this paper, by using C-class functions, we will present a coupled fixed problem in b-metric space for the single-valued operators satisfying a generalized contraction condition. First part of the paper is related to some fixed point theorems, the second part presents the uniqueness and existence for the solution of the coupled fixed point problem and in the third part we...
Impact of linear dimensionality reduction methods on the performance of anomaly detection algorithms in hyperspectral images
Anomaly Detection (AD) has recently become an important application of hyperspectral image analysis. The goal of these algorithms is to find the objects in the image scene which are anomalous in comparison to their surrounding background. One way to improve the performance and runtime of these algorithms is to use Dimensionality Reduction (DR) techniques. This paper evaluates the effect of thr...
Selective Cloud Point Extraction and Preconcentration of Copper by the Use of Dithizone as a Complexing Agent
The aim of this work was to develop a selective cloud point extraction method for the separation and preconcentration of copper(II) prior to spectrophotometric determination. For this purpose dithizone was used as a complexing agent and the experimental solution was acidified with sulfuric acid. Triton X-114 was used as a surfactant and after phase separation, based on the cloud point of th...
Journal title:
- CoRR
Volume: abs/1610.00494
Issue: -
Pages: -
Publication date: 2016